Induction of anti-tumor cytotoxic T lymphocytes in humans using 
peptide epitopes found by computer-based algorithms for vaccination 

Description 

This invention relates to a method for providing, identifying and/or 
optimizing peptides which induce cytotoxic T-lymphocytes and to the uses 
of the thus obtained peptides, in particular, for vaccination. 

In particular, this invention relates to a method for predicting and 
optimizing peptides and peptidomimetics, based on the application of 
pattern recognition technologies such as, for example, artificial neural 
networks, in combination with a selection for the highest degree of 
conservation, in particular, phylogenetic conservation and optimization 
through amino acid exchange at the anchor positions of the MHC-binding 
peptides, and the use of the identified amino acid sequences in a peptide 
pool, e.g. together with additional helper antigens as co-stimulators for 
vaccination. 

The present invention further relates to compositions and methods for the 
treatment of cancer and the treatment or prevention of viral infections. The 
invention, in particular, provides peptides based on a 9 residue epitope 
derived from tumor-associated or viral antigens. The peptides induce 
cytotoxic T cells that destroy tumor cells and virus-infected cells. 

Further, this invention relates to computer-assisted analysis of biological 
molecules, particularly of biologically active peptides and peptide mimetics, 
and the prediction of their biological and pharmacological potencies. 

Vaccines against tumors or viruses are based on specific antigens, in 
particular, on weakly immunogenic specific antigens, admixed with adjuvants 
in order to elicit, restore or augment immune responses against tumor cells, 
e.g. residual or metastatic tumor cells, or virus-infected cells. Cellular 
cytotoxicity is considered to play a major role in the elimination of tumor 
cells or virus-infected cells. Activation of cellular cytotoxicity within an 
organism requires at least three synergistic signals: epitopes derived from 
tumor-specific antigens presented by MHC class I molecules (HLA 
restriction), co-stimulatory signals provided by cell surface molecules of 
antigen-presenting cells (APCs), e.g. B-7.1 and B-7.2, and differentiation 
and propagation signals of cytokines. 

To activate cellular cytotoxicity it is therefore of great interest to find 
and/or provide pertinent HLA-restricted epitopes, especially also in view of 
the widespread occurrence of cancer and viral diseases. Therefore, it was 
an object of the invention to provide peptides which induce cytotoxic 
T-lymphocytes. 

According to the invention this object is achieved by a method for 
providing, identifying and/or optimizing peptides which induce cytotoxic 
T-lymphocytes, comprising the steps: 

(a) selecting one or more antigenic proteins, 

(b) selecting conserved regions within the protein sequence of 
the one or more antigenic proteins, and 

(c) identifying CD8+ T-cell epitopes within the protein sequence 
of the one or more antigenic proteins, preferably within the 
phylogenetically conserved regions. 

According to the method of the invention one or more antigenic proteins 
are selected in a first step. In particular, relevant antigenic proteins for 
various cancers or viruses are taken. The selection can be performed, for 



WO 02/072627 



PCT/EP02/02666 




example, by the man skilled in the art referring to literature or references 
describing antigenic proteins associated with cancers and viruses. 

In a second step, conserved regions within the protein sequence of one or 
more antigenic proteins are determined. The determination of conserved 
regions can be effected, for example, by comparison with other proteins, 
e.g. proteins stored in a database. In step (b) according to the invention 
conserved regions, i.e. regions which are subject only to minor changes 
during evolution, are determined. The selection of conserved regions, in 
particular, has the advantage that a high response rate is achieved in 
subsequent use of the peptides for inducing cytotoxic T-lymphocytes, and 
high effectiveness against the cancer cells and viruses to be attacked. In 
contrast to highly variable regions, conserved regions change only slightly 
and, thus, represent an excellent target for combatting cancer cells or 
viruses. It is especially preferable to select phylogenetically conserved 
regions within the protein sequences of the one or more antigenic proteins. 

In a further step according to the invention CD8+ T-cell epitopes are 
identified within the protein sequence of the one or more antigenic proteins 
and preferably within the conserved regions, in particular, within the 
phylogenetically conserved regions. Determination of CD8+ T-cell epitopes 
can be effected by means of pattern recognition technologies and, 
especially, by using an artificial neural network (ANN). Artificial intelligence 
and pattern recognition methods have proven to be powerful tools in 
the bioinformatics field. For example, artificial neural networks (ANNs) have 
been successfully applied to predict mitochondrial precursor cleavage sites 
(G. Schneider, P. Wrede, J. Mol. Evol. 36, 586 (1993)) and membrane-spanning 
amino acid sequences (R. Lohmann, G. Schneider, D. Behrens and 
P. Wrede, Protein Science 3, 1597 (1994); M. Milik and J. Skolnick, in: 
"Proceedings of Fourth Annual Conference on Evolutionary Programming", 
MIT Press, La Jolla (1995)). 




However, the identification of CD8+ T-cell epitopes or the prediction of 
MHC-I binding can be done by any technology available to the man skilled 
in the art. In particular, pattern recognition technologies can be applied. 
Preferably, however, an artificial neural network is used, since an ANN 
allows for prediction of MHC-I binding peptides with high accuracy. 
Particularly preferably, an ANN is used which has been trained with an 
evolutionary algorithm. 

In a preferred and advantageous embodiment, the method according to the 
invention further comprises the step: 

(d) optimizing the identified CD8+ T-cell epitopes by exchanging one or 
more amino acids. 

Preferably, the amino acids are exchanged in the anchor positions of the 
epitopes, in particular, in the anchor residues of the MHC-I binding 
peptides. Particularly preferably, said optimizing step is performed prior to 
the step of identifying CD8+ T-cell epitopes. According to the invention, 
modified epitopes, too, are thus tested for their binding efficacy, as a result 
of which new effective peptides can be found. 

Optimization of the CD8+ T-cell epitopes is preferably effected by 
exchanging the amino acid present for another amino acid at one or more 
positions of the peptides. Said exchange can be effected randomly and at 
arbitrary positions. It is preferred, however, to first determine anchor 
positions and then exchange the amino acids present at said anchor 
positions. Preferably, the replacement amino acids chosen are ones known 
to increase binding to MHC-I at these anchor positions. 

By means of the method of the invention, in particular, peptides having a 
length of from 4-30, more preferably from 5-20, still more preferably of at 
least 6, at least 7, at least 8 or at least 9 amino acids, and up to 15, 14, 
13, 12, 11 or 10 amino acids are obtained. It is particularly preferred to 
apply the invention to peptides having a length of 8, 9 or 10 amino acids, 
especially 9 amino acids. 

The term peptide as used herein also includes peptide mimetics which 
contain one or more non-naturally occurring amino acids, e.g. homoarginine, 
ornithine, etc. 

Selection of suitable peptides which induce cytotoxic T-lymphocytes can 
be effected by means of the above-described procedural steps, in 
particular, by selecting the respective best candidates of each procedural 
step, e.g. the best 50%, the best 30% or the best 10%. In addition, it is 
possible to incorporate filtering steps, by means of which particular 
peptides are selected and picked out as preferred or disposed of. 
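
The per-step selection of the best candidates can be sketched as follows. This is a minimal illustration only: the function name `keep_best_fraction` is hypothetical and the scores shown are made-up placeholder values, not results of the method.

```python
from typing import Dict, List


def keep_best_fraction(scores: Dict[str, float], fraction: float) -> List[str]:
    """Return the best-scoring `fraction` of candidates, highest score first."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    ranked = sorted(scores, key=lambda peptide: scores[peptide], reverse=True)
    keep = max(1, int(len(ranked) * fraction))  # always keep at least one candidate
    return ranked[:keep]


# Hypothetical scores for four candidate 9-mers (placeholder values).
candidates = {
    "VTAQVVLQA": 0.91,
    "LVHFLLLKY": 0.99,
    "FVWLHYYSV": 0.97,
    "AAAAAAAAA": 0.10,
}
best_half = keep_best_fraction(candidates, 0.5)
```

Applying the same function after each procedural step, with 0.5, 0.3 or 0.1 as the fraction, would reproduce the "best 50%, 30% or 10%" selection described above.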

According to the invention the predicted, identified or optimized epitope 
peptides can be verified by in vitro or in vivo tests, especially by in vitro 
tests. 

The peptides obtained according to the invention, finally, can be used as 
pharmaceuticals, especially as a vaccine. In particular, tumors and virus 
infections can be treated or prevented successfully by means of the 
peptides obtained according to the invention. 

Therefore, the invention further relates to a pharmaceutical composition 
comprising one or more peptides obtainable by the method described 
above. This pharmaceutical composition can comprise further adjuvants, 
co-factors and/or co-stimulating agents, e.g. recall antigens as adjuvants 
for CD4+ T-cell stimulation and for induction of co-stimulation for peptide 
and disease-specific CD8+ cytotoxic T-cells. Particularly preferably, the 
pharmaceutical composition is a vaccine, in particular, a vaccine for the 
treatment and/or prevention of cancer or viral infections. The 
pharmaceutical composition can be in any suitable administration form, 
with intracutaneous and parenteral administration being preferred. 

An important and most preferred aspect of the invention is the combination 
of methods to identify peptides and the subsequent use of the peptides 
found as pharmaceutical composition, in particular, for vaccination. 
Therefore, a most preferred embodiment of the invention is a method for 
providing a pharmaceutical composition for the induction of cytotoxic 
T-lymphocytes comprising: 

(a) providing one or more peptides which induce cytotoxic 
T-lymphocytes according to the method described above, and 

(b) using the one or more peptides for the manufacture of a 
pharmaceutical composition. 

The invention makes it possible, in a unique manner, to combine these two 
steps. In particular, the invention makes it possible to actually provide 
pharmaceuticals starting out from computer-based predictions. 

The invention further relates to the peptides discovered by means of the 
inventive method, in particular, as shown in Tables 1, 2, 3 and 4 below, as 
well as to pharmaceutical compositions containing one or more of these 
peptides or other peptides discovered by means of the method of the 
invention, in particular, at least 2, at least 3, at least 4, at least 5, at least 
10 or at least 20 and up to 100, preferably up to 90, up to 80, up to 70, 
up to 60 or up to 50 of such peptides. 

Further peptide sequences of the invention are as shown in the following. 
In these sequences the amino acid at positions 2, 6 or/and 9 each 
independently can be replaced by V, L, I or/and M. 

Table 1 

Score      Sequence    Position 

catd_human 
1.000000   YLSQDTVSV   150-158 
0.9993X3   KLVDQNIFS   222-230 
1.000000   LVDQNIFSF   223-231 
0.997769   DQNIFSFYL   225-233 
0.989067   VTRKAYWQV   264-272 
0.999877   QVHLDQVKV   271-279 
0.999934   HLDQVEVAS   273-281 

creb_human 
0.916947   ILNDLSSDA   137-145 
0.998294   TTILQYAQT   219-227 
0.989879   TILQYAQTT   220-228 
0.997264   DVQTYQIRT   248-256 
0.999142   AARKREVRL   282-290 
0.999527   AVLENQNKT   316-324 
0.999975   VLENQNKTL   317-325 
0.999923   TLIEELKAL   324-332 

ctag_human 
0.999339   ELARRSLAQ   103-111 
0.999982   VLLKEFTVS   121-129 
0.999974   NILTIRLTA   131-139 
0.999991   ILTIRLTAA   132-140 
0.998567   TIRLTAADH   134-142 
0.999960   AADHRQLQL   139-147 

erb2_human 
0.999709   SFLQDIQEV   72-80 
0.999802   LXAHNQVRQ   85-93 
0.999996   QLFEDNYAL   106-114 
0.999996   LFEDNYALA   107-115 
0.974558   QLRSLTEIL   141-149 
0.998018   TILWXDIFH   166-174 
0.611272   ILWXDIFHK   167-175 
0.999929   DIFHKNNQL   171-179 
0.996131   KNNQLALTL   175-183 
0.947400   NNQLALTLI   176-184 
0.999993   QLALTLIDT   178-186 

Sequence    Position 

            349-357 
V7ETLESIT   423-431 
            466-474 
LTSIISAW 
LIKRRQQKI   674-682 
RLLQETELV   683-697 
ETSLRKVKV   717-725 
AJKVLRENT   751-759 
LTSTVQLVT   790-798 
STVQLVTQL   792-800 
YLEDVRLVH   835-843 
RLVHRDIAA   840-848 
DLAARNVLV   845-853 
LLDIDETEY   869-877 
DIDETHYHA   871-879 
SILRREFTH   893-901 
ILRRRTTHQ   894-902 
RFTHQSDVW   898-906 
THQSDVWSY   900-908 
RFRELVSEF   968-976 
FWIQNEDL    986-934 
DLVDAEEYL   1016-1024 
LVDAEEYLV   1017-1025 

0.999957 
0.999927 
0.999967 
0.998428 
0.834140 
0.999736 
0.999164 
0.922432 
0.999993 
0.999985 
0.704970 
0.999930 
0.990487 
0.399361 
0.998516 
0.891765 
0.999969 
0.599802 
0.999964 
0.999992 
0.999993 
0.999885 
0.999956 
0.997836 
0.999221 
0.884728 
0.999962 
0.999439 
0.990193 
0.999995 

gp100_human 
0.986531   QVIWVNNTI   101-109 
1.000000   VIWVNNTII   102-110 
0.991766   SWSQKRSFV   142-150 
0.999969   SFVYVWKTW   148-156 
0.999897   SVSVSQLRA   216-224 
0.999997   YLAEADLSY   250-258 
0.990549   VTAQVVLQA   286-294 
0.999610   TTAAQVTTT   413-421 
0.996788   AAQVTTTEW   415-423 
0.983375   VTTTEWVET   418-426 
0.999911   SFSVTLDXV   482-490 
0.999597   NVSLADTNS   568-576 
0.999882   SLADTNSIA   570-578 
0.994679   LADTNSLAV   571-579 
0.998051   HSSSHWLRL   632-640 



mage1_human 
0.999970   ALEAQQEAL   15-23 
0.999915   ILESLFRAV   93-101 
1.000000   VITKKVADL   101-109 
0.927034   ASESLQLVF   147-155 
0.978865   KLLTQDLVQ   237-245 
0.999998   LVQEKYLEY   243-251 
0.996432   IAETSYVKV   271-279 
0.999888   YVKVLEYVI   276-284 
0.999949   KVLEYVIKV   278-286 
0.988293   YVTKVSARV   282-290 
0.999463   KVSARVRFF   285-293 
0.999985 
0.998253 

mage5_human 
0.999973   AXDFTLVTRQ   74-82 
0.999964   DFTLWRQSI    76-84 
1.000000   ALSKKVADL    108-116 
0.999983   KVADLIHFL    112-120 
0.997569   VADLIHFLL    113-121 
1.000000   LIHFLLLKY    116-124 

mage6_human 
           AASSSSTLV    38-46 
0.999920   DLESEFQAA    100-108 
1.000000                108-116 
1.000000   LVHFLLLKY 

mage8_human 
0.998794 
0.999999 

mageA_human 
0.997990 
1.000000 

mageB_human 
0.996542 
0.999977 
1.000000 
0.999993 
1.000000 
0.999952 
0.999894 
0.999975 
0.989319 
0.999988 
           SVKEHRKIY   92-100    6 
0.998606   VWNQQESS    108-116   7 
0.982799   STSSRRRAI   157-165   15 
0.999689   AISETEENS   164-172   12 
0.912657   RHKSDSISL   183-191   11 
0.999287   SISLSFDES   188-196   12 
0.999913   SLSFDESIA   190-198   15 
0.998285   SVSDQFSVE   240-248   7 
0.999960   SVEFEVESL   246-254   20 
0.999996   SLDSEDYSL   253-261   25 
0.999888   IIYSSQEDV   403-411   21 

mif_human 
0.999946   FLSELTQQL   18-26 
0.999957   ELTQQIAQA   21-29 
0.942786   LLAKRLRIS   82-90 

p53_human 
0.999372   ETFSDLWKL   17-25 
0.999971   TFSDLWKLL   18-26 
0.916063   NTFRHSVW    210-218 
0.999934   ALELKDAQA   347-355 

tyr2_human 
1.000000   VTRQNIKSL   125-133 
0.999786   ALDIAKKRV   144-152 
0.999993   SVYDFFVWL   180-188 
0.999999   FVWLHYYSV   185-193 
0.999976   FVTWHRYHL   216-224 
0.994160   VTWHRYHLL   217-225 
0.999975   TLISRNSRF   271-279 
0.977456   SRNSRFSSW   274-282 
0.999984   SLDDYNHLV   288-296 
1.000000   FFQNSTFSF   339-347 
1.000000   SLHNLVHSF   367-375 
0.999237   IFVVLHSFT   391-399 
0.999957   VVLHSFTDA   393-401 
0.995790   VLHSFTDAI   394-402 
0.955026   VTNEELFLT   439-447 
0.990634   ELFLTSDQL   443-451 
0.955259   HLSSKRYTE   509-517 

tyro_human 
0.999956   LLWSFQTSA   9-17 
0.999999   RLLVRRNIF   116-124 
1.000000   LVRRNIFDL   118-126 
0.999147   KFFAYLTLA   133-141 
0.999998   YLTIAKHTI   137-145 
0.999987   TLAKHTISS   139-147 
0.987547   AKHTISSDY   141-149 
0.999999   DINIYDLFV   169-177 
0.995893   FLLRWEQEI   214-222 
0.999994   SFFSSWQIV   267-275 
0.999999   IFLLHHAFV   385-393 
0.999924   LLHHAFVDS   387-395 
0.997281   AFVDSIFEQ   391-399 
0.999962   FVDSIFEQW   392-400 
0.998227   SIFEQWLRR   395-403 
0.927015   YLEQASRIW   467-475 
0.979646   ASRIWSWLL   471-479 

Further peptide sequences of the invention are as shown in the following. In 
these sequences the amino acids at positions 2, 6 or/and 9 each 
independently can be replaced by V, L, I or/and M. 

Table 2 

Pos.   Sequence    Modification          Identity scores   Comments 

BCL2_HUMAN 
154    RIVAFFEFI   G -> I Pos 9          187   229 
137    RFATWEEL                          127   188 
188    YLNRHLHTW                         124   188 

CCEM_HUMAN 
25     RLLLTASLL                         203   237 
26     LLLTASLLT                         209   237 
27     LLTASLLTF                         210   236 
28     LTASLLTFW                         210   236 
108    IIYSNASLL   P -> S Pos 4          183   229 

CD19_HUMAN 
427    EFYENDSNL                         35    44 
326    VLRRKRKRI   M -> I Pos 9          35    44 
302    AVTLAYLIF                         31    41 
287    VLWHWLLRT                         30    41 

CGD1_HUMAN 
63     SLRKXVATW   M -> L Pos 2          431   709 
92     YLDRFLSLI   E -> I Pos 9          491   656 
152    LVNKLKWNL                         320   630 

CTAG_HUMAN 
129    VLLKEFTVS   *                     24    24    6,069,233 
       J Exp Med 2000 Feb 21; 191(4):625-30: 15 AA epitope for MHC II 

ERB2_HUMAN (Her-2) 
827    RLVHRDLAA                         687   802 
       6,037,135: 10 AA (RLVHRDLAA R), Seq-Id 288 for HLA-A3.2 
       6,075,122: identical sequence patented, Seq ID 18 
832    DLAARNVLV   I/L at pos. 9 often   789   860 
       6,075,122: identical sequence patented, Seq ID 9 
885    RFTHQSDVW                         611   817 

MUC1_HUMAN 
1049   SFFFLSFHI                         42    42 
1139   RYNLTISDV                         39    39 
1061   QFNSSLEDI   P -> I Pos 9          44    44 

TRSR_HUMAN 
271    TFAEKVANA                         219   287 
413    VIAQRDAWI   G -> I Pos 2 + 9      232   312 
455    SIIFASWSA                         251   332 
489    YINLDKAVL                         222   293 

TYR2_HUMAN 
188    SVYDFFVWL                         111   147   Cancer Res. 1998; 58(21):4895 
193    FVWLHYYSV                         124   148 
       6,083,703: 10 AA peptide, Seq-Id 17; no activity seen in test 
       6,132,980: see above 
224    FVTWHRYHL                         128   168 
282    SRNSRFSSW                         111   146 
351    STFSFRNAL                         106   144 
CATD_HUMAN 725 811 

106 TISSNLWVI G. P "> I Pos 354 543 

272 VTRKAYWQV 456 g 02 

PM17_HUMAN 47 56 

258 YLAEADLSY 45 5g 

294 VTAQWLQA 48 56 
576 NVSLADTNS 

CREB_HUMAN 115 124 

141 SYRKILNDL 1Q4 1Q4 
325 VLENQNKTL 



P53_HUMAN 216 

25 ETFSDLWKL 281 
218 NTFRHSVW Q3 
257 RIH.TIITL P -> I Pos 2 295 ^OJ 
355 ALELKDAQA 

MIF_HUMAN 

26 FLSELTQQL 

MAG1_HUMAN ^ ... 

117 LVHFLLLKY G -> H Pos 3 == MAG2 126 143 

« 037 135 • sea- ID 1205; HLA-3 and 1 1 binding; no CTL response 

j' innanol " 1999^ Sep 1,163 (5) : 2 92S-36: 14>er with T^e.. response 

136 ILESVIKNY M -> I Po: 
129 ELVTKAEIL M -> I Poi 
p -> L Po: 

155 ASESLQLVF 
245 KLLTQDLVQ 
251 LVQENYIiEY K -> N Po 

279 LIETSYVKV A -> I Pos 2 ZZZa^J* 
6,147,187: Ser-XD 11; HLA-2.1 -> clearly claxmed 





MAGA 


111 138 




MAGA 


130 150 




MAG 4 








112 135 






117 130 




MAG2 


119 137 




MAG2 


103 130 


-> 


clearly 


claimed 



Further peptide sequences of the invention are as shown in the following. In 
these sequences the amino acids at positions 2, 6 or/and 9 each 
independently can be replaced by V, L, I or/and M. 

Table 3 

Protein (Swiss-Prot ID)   Peptide sequence   Position in the protein   Note 

VGR3_HUMAN                DLAARNILL          1037-1045 
                          TTQSUVWbr 
                          VLLWEIFSL          1102-1110 
VEGF_HUMAN                TLVDfFQEY          57-65 
CD34_HUMAN                ILDFTEQDV          272-280 
                          TLIALVTSI          290-298                   at pos9: G -> I 
                          I IUAIoKNI         004-0/ Z                  at pos9: - 
ETS1_HUMAN                QLWQFLLEL          336-344 
PEC1_HUMAN                VIVNNKEKT          111-119 
                          IIIQKDKA!          270-278 
                          SIWNITEL           316-324 
MDM2_HUMAN                SVKEHRKIY          92-100 
MM01_HUMAN                HLTYRIENY          113-121 
                          AFQLWSNVT          137-145 
                          LHRVAAHEL          212-220 

Further peptide sequences of the invention are as shown in the following. 
In these sequences the amino acid at positions 2, 6 or/and 9 each 
independently can be replaced by V, L, I or/and M. 

Table 4 

Position   Sequence      Filter   Conservation score   Conservation score   ANN score 

rrp2 
442-450    RHNYFTAEV     1        253                  272                  0.732 
653-667    SAESRKLLI*    1        262                  271                  0.873 
510-518    hlrndtdw      1        264                  268                  0.753 
701-709                  1        247                  268                  0.973 
417-425    Y.ThCTUTUT.   1        230                  268                  0.948 
420-428                           232                  268                  0.980 
638-646                  1        237                  267                  0.900 

rrp3 
548-554                  1        306                  327                  0.982 
736-744    KRKRNSSILi    1        274                  326                  0.860 
496-504    VSlDRTIiKV    1        302                  325                  0.502 
228-234    SVYIEVtoEli   1        303                  325                  0.992 
19-27      ILT2CTTVDH    1        306                  324                  0.965 
           SVLVNTYQW     1        304                  324                  0.992 

hema 
51-59      KVTNATELV     1        679                  825                  0.618 
385-393    STQAATDQI     1        767                  818                  0.798 
435-443    DLWSYNAEL     1        720                  817                  0.985 
463-471                  1        668                  815                  0.925 
245-253                  1        656                  815                  0.969 
447-455    LENQHTIDL     1        715                  810                  0.933 
382-390    DLKSTQAAI     1        755                  800                  0.837 
380-388    AADLKSTQA     1        748                  800                  0.741 

vmt1 
153-161    QIADSQHRS     1        155                  179                  0.738 
180-188    VLASTTAXA     1        162                  177                  0.990 
232-240    DLLENLQAY     1        155                  171                  0.953 
102-110    KLKREITFH     1        149                  171                  0.555 

vmt2 
35-43      ILHLXXWIL     1        129                  143                  0.998 
83-91      AVDADDSHP     1        24                   142                  0.989 
39-47      ILWXLDHLF     1                             142                  0.973 

nram 
217-225    SWSKNIDRX     1        380                  462                  0.995 
438-446    WTSNSIWF      1        309                  436                  0.967 
437-445    WWTSNSIW      1        305                  416                  0.895 
435-443    RWWTSNSI      1        287                  406                  0.961 
389-397    KLQIHRQVI     1        245                  356                  0.984 
222-230    IIiRTQESEC    0        473                  492                  0.993 
2-10       NPNQKIITI     0        416                  429                  0.949 

vnb 
28-36      SFTVUTVF      1        94                   93                   0.998 
3-11       NATFNYTNV     1        96                   96                   0.913 

Particularly preferred are peptides VTAQVVLQA, VLAQVVLQL, 
LVHFLLLKY, LLHFLLLKL, FVWLHYYSV or FLWLHYYSL, which showed 
particularly high activity in step (b), as well as variants generated by AA 
exchange at position 2, 6 and/or 9, e.g. by V, L, I or M. 

The invention further relates to the use of the peptides found by the 
method of the invention for the production of a pharmaceutical for the 
induction of cytotoxic T-lymphocytes, in particular, for the prevention, 
treatment or diagnosis of cancer or viral infections. 

The invention and the individual procedural steps will be explained in detail 
below. 

HLA-restricted specific epitopes recognized by cytotoxic T cells are 
peptides of defined sequences of amino acids and can be characterized 
with artificial intelligence and pattern recognition methods in combination 
with additional filters and optimization steps described herein. The 
predicted epitope peptides can be verified with biological assays for tumor 
or virus antigen-specific T cell activities using peripheral white blood cells 
of patients as source for the specific T cells. A composition of 
HLA-restricted specific antigenic peptides (1-100) for a particular virus or 
tumor together with adjuvants as CD4+ helper T cell activators can be 
used for effective vaccination. 

A number of HLA-restricted tumor-specific epitopes and antigenic peptides 
for various cancers and viruses detected with the method of this invention 
are listed in the Tables. 

Procedure: 

a) Prediction of MHC-I specific epitopes 

Generation of a prediction tool for MHC-I binding and/or T-cell 
activation. This can be done by using any state-of-the-art 
technology for structure-activity relationship (SAR) model 
generation, such as ANNs, support vector machines (SVMs), SIMCA P, 
partial least squares projection to latent structures (PLS), etc. As the 
basis for the application of these technologies a representative data 
set of peptides is used. This dataset consists, e.g., of peptides 
known to bind to a given MHC-I molecule, e.g. those stored within 
the SYFPEITHI database (Hans-Georg Rammensee, Jutta Bachmann, 
Niels Nikolaus Emmerich, Oskar Alexander Bachor, Stefan 
Stevanovic: SYFPEITHI: database for MHC ligands and peptide 
motifs. Immunogenetics (1999) 50: 213-219), and peptides that do 
not bind. Because there is only limited data on 
experimentally proven non-binding peptides, a set of randomly 
generated peptides can be used for model generation, e.g. all 
epitopes that can be generated out of the p53 protein. In this 
particular case ANNs were trained for HLA-0201, HLA-0101 and 
HLA-1101, based on the epitopes given in the SYFPEITHI database, using 
an evolutionary algorithm for optimization of weights and biases 
within the neural network. The criterion for using a generated SAR 
model for epitope prediction is the prediction quality of said model 
on a test dataset that has not been used for training. The neural 
networks used within the next steps of this invention were able to 
correctly assign almost all test data to the corresponding class 
(binding, non-binding). 
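
The idea of optimizing network weights and biases with an evolutionary algorithm can be illustrated with a deliberately small sketch. Everything here is an assumption made for illustration: the text does not specify the network architecture, the encoding or the evolutionary scheme, so a single sigmoid unit over a one-hot encoding and a simple elitist mutate-and-select loop stand in for them, and the four toy peptides and their labels are invented.

```python
import math
import random
from typing import List, Sequence, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
Example = Tuple[List[float], bool]  # (encoded peptide, binds?)


def encode(peptide: str) -> List[float]:
    """One-hot encode a peptide: 20 inputs per sequence position."""
    vec = [0.0] * (len(peptide) * len(AMINO_ACIDS))
    for pos, aa in enumerate(peptide):
        vec[pos * len(AMINO_ACIDS) + AMINO_ACIDS.index(aa)] = 1.0
    return vec


def predict(weights: Sequence[float], x: Sequence[float]) -> float:
    """Single sigmoid unit; the last entry of `weights` is the bias."""
    z = weights[-1] + sum(w * xi for w, xi in zip(weights, x))
    z = max(-30.0, min(30.0, z))  # keep exp() in a safe range
    return 1.0 / (1.0 + math.exp(-z))


def accuracy(weights: Sequence[float], data: Sequence[Example]) -> float:
    """Fraction of examples assigned to the correct class."""
    hits = sum((predict(weights, x) >= 0.5) == label for x, label in data)
    return hits / len(data)


def evolve(data: Sequence[Example], generations: int = 100,
           offspring: int = 10, sigma: float = 0.1,
           seed: int = 0) -> Tuple[List[float], float]:
    """Mutate the current weights and keep a child if it is not worse."""
    rng = random.Random(seed)
    parent = [0.0] * (len(data[0][0]) + 1)
    best = accuracy(parent, data)
    for _ in range(generations):
        for _ in range(offspring):
            child = [w + rng.gauss(0.0, sigma) for w in parent]
            score = accuracy(child, data)
            if score >= best:  # elitist: never accept a worse network
                parent, best = child, score
    return parent, best


# Invented toy data: two "binders" and two "non-binders".
toy = [(encode(p), True) for p in ("ALAAAAAAV", "GLGGGGGGL")] + \
      [(encode(p), False) for p in ("AAAAAAAAA", "GGGGGGGGG")]
weights, train_accuracy = evolve(toy)
```

In line with the criterion stated above, a real model would be judged on a separate test set that was not used for training; the sketch reports only the training accuracy.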

Selection of the relevant antigenic proteins for various cancers and 
viruses. 

This is done according to current state-of-the-art technology and 
knowledge. The following criteria can be used for selection: 

Proteins described in the literature as a source of tumor-associated 
antigens 
Proteins involved in apoptotic processes, e.g. p53 
Proteins belonging to the tumor-testis antigens and embryonic 
antigens, e.g. MAGE, BAGE, GAGE, CEA, AFP 
Proteins that are expressed in specific tissues, e.g. 
tyrosinase 

A procedure defining the degree of conservation for each potential 
epitope within the protein sequence, in particular, a procedure 
selecting (phylogenetically) conserved regions within a protein 
sequence. 

This procedure consists of 3 steps: 

1. Performing a similarity search against protein and/or nucleic 
acid databases containing human and/or non-human 
sequences, e.g. by using BLAST (Altschul, Stephen F., 
Thomas L. Madden, Alejandro A. Schaffer, Jinghui Zhang, 
Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
"Gapped BLAST and PSI-BLAST: a new generation of protein 
database search programs", Nucleic Acids Res. 
25:3389-3402), FASTA or any other available tool. 
See the example in figure 1. 

2. Defining a similarity cutoff, e.g. when using BLASTP the 
"expect threshold" can be set to 1e-30. Only those proteins 
with a similarity higher than the selected cutoff are used to 
perform step 3. 

3. Calculating the degree of conservation for each potential 
epitope. For this, the complete sequence of the selected 
tumor antigen is chopped into overlapping 9-mers (8-mers, 
10-mers). For each of these epitopes a conservation score is 
calculated. This can be done by simply summing up the 
number of identical AA between the selected antigenic 
protein and the identified homologous proteins over all epitope 
positions. Alternatively, substitution matrices, e.g. BLOSUM, 
PAM etc. (see Altschul et al.), can be used. 

An example is given in figure 2. 
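
Step 3 (chopping the antigen into overlapping 9-mers and summing identical residues over the homologues) can be sketched as follows. The sketch assumes that the homologues have already been found and aligned to the antigen, so that every homologue string has the same length as the antigen with "-" at gap positions; the similarity search itself is not shown, and the function names and toy sequences are hypothetical.

```python
from typing import Dict, List


def overlapping_kmers(sequence: str, k: int = 9) -> List[str]:
    """Chop a protein sequence into all overlapping k-mers."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def conservation_scores(antigen: str, aligned_homologues: List[str],
                        k: int = 9) -> Dict[int, int]:
    """Map each 1-based k-mer start position to its identity count.

    The count is the number of identical residues between antigen and
    homologue, summed over all k window positions and all homologues.
    """
    scores: Dict[int, int] = {}
    for start in range(len(antigen) - k + 1):
        total = 0
        for homologue in aligned_homologues:
            pairs = zip(antigen[start:start + k], homologue[start:start + k])
            total += sum(a == b for a, b in pairs)
        scores[start + 1] = total
    return scores


# Toy 11-residue "antigen" with two pre-aligned homologues (invented data).
antigen = "ACDEFGHIKLM"
homologues = ["ACDEFGHIKLM", "ACDEFGHIKAA"]
kmers = overlapping_kmers(antigen)
scores = conservation_scores(antigen, homologues)
```

With two homologues the maximum score of a 9-mer is 18; windows that extend into the diverging C-terminal end of the second homologue score lower. Replacing the identity test with a lookup in a substitution matrix would give the BLOSUM or PAM variant mentioned above.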

A procedure generating all possible peptide variants out of each 
epitope within the selected tumor antigen, by exchanging the natural 
amino acid at certain anchor residues for more preferred amino 
acids. In particular, an optimization step where amino acids (AA) 
within the so-called anchor residues of the MHC-I binding peptides 
are exchanged. This procedure consists of 3 steps: 

1. Based on the knowledge about known epitopes (Hans-Georg 
Rammensee, Jutta Bachmann, Niels Nikolaus Emmerich, 
Oskar Alexander Bachor, Stefan Stevanovic: SYFPEITHI: 
database for MHC ligands and peptide motifs. 
Immunogenetics (1999) 50: 213-219), or by using the so-called 
"virtual alanine scan" technology (see 
PCT/EP01/14808), or by using any other technology, the so-called 
"anchor residues" are identified. These are the 
positions within the epitope that are most important for 
binding to the given MHC receptor. 

2. Moreover, by applying the same technologies, those AA that 
are most preferable in these anchor positions are identified, 
e.g. for HLA-0201 the anchor positions are positions 2 and 9, 
with L, M, V and I (isoleucine) most preferred in the 
corresponding positions (according to Rammensee et al.). 
These preferred AA can also belong to the group of 
non-natural AA. 

3. The last step comprises the in silico generation of all possible 
peptide variants, e.g. for each epitope there are 8 peptide 
variants in case of 2 anchor residues with 2 different 
preferred amino acids each. These peptides are only virtually 
generated, so no peptide synthesis has to be applied at this 
stage of the process. When non-natural AA are included, so-called 
peptidomimetics are generated. 
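
The in silico variant generation of step 3 can be sketched as follows. The function name `anchor_variants` and the choice of preferred residues per anchor are assumptions for illustration. One reading of the count of 8 given above is used: each of the 2 anchor positions can keep its natural residue or take one of 2 preferred residues, giving 3 x 3 = 9 combinations, of which one is the unchanged epitope.

```python
from itertools import product
from typing import Dict, List, Sequence


def anchor_variants(epitope: str,
                    preferred: Dict[int, Sequence[str]]) -> List[str]:
    """Return all variants with anchor residues exchanged.

    `preferred` maps a 1-based anchor position to the amino acids to try
    there. The unchanged epitope itself is not returned.
    """
    options = []
    for pos, natural in enumerate(epitope, start=1):
        extra = [aa for aa in preferred.get(pos, ()) if aa != natural]
        options.append([natural] + extra)
    variants = ["".join(combo) for combo in product(*options)]
    return [v for v in variants if v != epitope]


# Assumed example: anchors at positions 2 and 9, two preferred residues each.
variants = anchor_variants("VTAQVVLQA", {2: ["L", "M"], 9: ["V", "L"]})
```

For the epitope VTAQVVLQA this yields 8 variants, among them VLAQVVLQL, which is one of the peptides named as particularly preferred elsewhere in the text.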

Evaluation of all potential epitopes generated within the previous 
steps by the SAR model, e.g. the ANNs trained in step 1. In particular, 
prediction of CD8+ T-cell epitopes, e.g. with an ANN. According 
to the results of the prediction the epitopes are ranked. 
The selection (filtering) of epitopes out of the ranked list is 
preferably done according to the following criteria: 

1. The SAR model predicts high MHC-I binding for the epitope, 
preferably the highest. 

2. The epitope is predicted to bind to more than one MHC-I 
molecule. 

3. The epitope has a high conservation score, preferably the 
highest among all epitopes of a given tumor antigen. 

4. The epitope has the following properties: 

a. The epitope does not contain any of the following amino 
acids: P, M, C, G. 

b. The epitope does not contain four of the aliphatic 
amino acids (I, L) in a row, e.g. ILLL is filtered out, but 
ILLFL is permitted. 

c. The epitope does not contain the sequence PEST. 
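
The sequence-based properties listed under criterion 4 lend themselves to a simple filter. The following is a minimal sketch; the function name `passes_sequence_filters` is hypothetical, and the score-based criteria 1 to 3 are not covered.

```python
import re

FORBIDDEN_RESIDUES = set("PMCG")            # criterion 4a
FOUR_ALIPHATIC = re.compile(r"[IL]{4}")     # criterion 4b: four I/L in a row


def passes_sequence_filters(epitope: str) -> bool:
    """Return True if the epitope satisfies criteria 4a, 4b and 4c."""
    if FORBIDDEN_RESIDUES & set(epitope):
        return False
    if FOUR_ALIPHATIC.search(epitope):
        return False
    # Criterion 4c. Because P is already excluded by 4a, this check only
    # matters if 4a is relaxed; it is kept to mirror the list above.
    if "PEST" in epitope:
        return False
    return True


kept = [p for p in ("AILLLAAAA", "AILLFLAAA", "APAAAAAAA", "YLSQDTVSV")
        if passes_sequence_filters(p)]
```

As in the example given under 4b, a peptide containing ILLL is rejected while one containing ILLFL passes.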

b) Verification of the predicted epitope peptides 

Verification of the predicted epitopes with synthetic peptides and 
assays for the cytolytic activity and anti-tumor or anti-virus efficacy 
of the epitope-specific T cells using peripheral white blood cells of 
patients as source of specific T cells. 

Those epitopes selected according to part a) of the procedure are 
synthesized with standard procedures and tested in an in vitro assay, e.g. 
as described in PCT/DE99/00175 and Kern F. et al., T-cell epitope mapping 
by flow cytometry, Nature Medicine (1998) 4(8):975-8. Those 
epitopes that cause a specific T cell reaction within this assay are 
taken forward to step c). 
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c) Vaccination with predicted epitopes 

Generation of vaccines that consist of 1-100, preferably 2-90, more 
preferably 5-80 and most preferably 10-50 relevant peptides as 
identified by a) and/or b) and optionally specific recall antigens as 
adjuvants for CD4+ T cell stimulation and for induction of 
co-stimulation for the peptide- and disease-specific CD8+ cytotoxic T 
cells (CTL) or with adjuvants, co-factors or general CD4+ T-cell 
stimulation antigens for co-stimulation of CD8+ CTLs. 

In principle the epitopes identified within steps a) and b) can be used in 
several vaccination strategies and are as such not restricted to the 
one mentioned above. 

Vaccination, in particular, intracutaneous or parenteral vaccination in 
humans with the vaccine pool. 

There are two patents claiming the application of ANN for the prediction of 
MHC binding motifs of biologically active peptides and peptide mimetics 
(DE 198 26 442, WO 98/53407 C2). 

The method presented within this invention preferably combines the 
application of ANN with two additional steps: 

- An optimization step in which amino acids (AA) at the so-called 
  anchor positions of the MHC-I binding peptides are exchanged. 
- A procedure for selecting conserved regions within a protein sequence. 

The optimization and the selection procedure can apply knowledge-based 
and/or computer-based algorithms. 

This invention provides the following advantages in comparison to 
previously described methods for T-cell epitope prediction: 

The epitopes yielding the highest CTL response in most human 
individuals will be the least variable ones and therefore be of the 
highest pharmacological relevance. 

The specific optimization step will improve the MHC-binding 
properties of the peptides without affecting the biological activity of 
the peptide. The application of this optimization procedure to all 
9-mers (8-mers, 10-mers) of a given tumor antigen allows the 
identification of previously unidentified epitopes and mimotopes. 
Further, it is possible to obtain biologically active peptides that differ 
from naturally occurring sequences. 

The parallel prediction of binding to several different MHC-I 
molecules allows the identification of epitopes that have a 
significantly higher application potential. 

The application of knowledge-based filters (PEST sequences, 
non-tolerated amino acids etc.) increases the probability of biological 
effects and the application potential. 

The usage of in vitro assays for the verification of the epitopes, 
based on the biological reactivity of cytotoxic T cells of cancer or 
virus infection patients, ensures detection of disease-relevant 
specificities. 

The usage of state-of-the-art pattern recognition technologies in 
combination with the aforementioned steps yields a higher 
prediction accuracy. 

For vaccination, 1-100 peptides related to a particular virus or 
cancer will be used as a vaccine. Additionally, specific co-factors, 
adjuvants and CD4+ T-cell antigens for co-stimulation of CD8+ 
T-cells will be included. This can be applied intracutaneously, 
parenterally, etc. 

Fig. 1 schematically shows a similarity search, and 

Fig. 2 shows an example of the calculation of conservation scores. 

Examples 

Example 1 

The performance of the method of the invention will be explained in the 
following by way of an example. 

First, an antigenic protein is selected, e.g. from a database. In the case of 
this example, a protein having 509 amino acids is chosen as an antigenic 
protein. Said protein is fragmented virtually (by computer) to give about 500 
overlapping peptides having a length of 9 amino acids each. A conservation 
score is determined for each of these 9-mers. In the subsequent optional step 
anchor positions and preferred amino acids at these positions are 
determined. In the case of this example it is assumed that the anchor 
positions are positions 2 and 9 and that 2 optimal amino acids are described 
in the prior art for each position. This leads to 8 variants for each 9-mer, 
so a total of about 4,500 epitopes are present (8 variants and 1 original per 
9-mer). These epitopes are now tested as to whether they are CD8+ T-cell 
epitopes by means of a pattern recognition technology, e.g. SAR and ANN, 
respectively. In particular, MHC binding capacity can be determined this 
way. 
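
The virtual fragmentation and the resulting number of candidates can be reproduced with a sliding window. The sketch below is illustrative only: the function name `fragment` is hypothetical and the 509-residue sequence is an arbitrary placeholder, not a real antigen. A protein of length N yields N - k + 1 overlapping k-mers, which for N = 509 and k = 9 is 501, in line with the round figure of 500 used in the example.

```python
def fragment(sequence, k=9):
    """Return all overlapping k-mers with their 1-based start positions."""
    return [(i + 1, sequence[i:i + k]) for i in range(len(sequence) - k + 1)]


# Placeholder sequence of 509 residues (assumption, for counting only).
protein = ("ACDEFGHIKLMNPQRSTVWY" * 26)[:509]

peptides = fragment(protein)
variants_per_peptide = 8  # 2 anchor positions, 2 preferred residues each
candidates = len(peptides) * (variants_per_peptide + 1)  # variants + original

print(len(peptides), candidates)
```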

Assuming it is found that 300 epitopes are effective, the conservation 
score of these 300 epitopes is now used to determine the best 100 
epitopes. 
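
The selection by conservation score described here amounts to a sort followed by a cut-off. The following sketch assumes that each candidate already carries a prediction result and a conservation score; the `Candidate` type, its field names and the example values are hypothetical and serve only to illustrate the step.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    peptide: str
    predicted_epitope: bool  # outcome of the pattern recognition step
    conservation: float      # conservation score of the underlying 9-mer


def select_most_conserved(candidates, n=100):
    """Keep predicted epitopes only and return the n most conserved ones."""
    predicted = [c for c in candidates if c.predicted_epitope]
    return sorted(predicted, key=lambda c: c.conservation, reverse=True)[:n]


# Made-up scores, for illustration only.
pool = [
    Candidate("VTAQVVLQA", True, 0.6),
    Candidate("LVHFLLLKY", True, 0.9),
    Candidate("FVWLHYYSV", False, 1.0),
    Candidate("FLWLHYYSL", True, 0.7),
]
best = select_most_conserved(pool, n=2)
print([c.peptide for c in best])
```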

Subsequently, a filter can be used which sorts out particular peptides, e.g. 
peptides containing proline (because of unfavorable folding) and peptides 
for which synthesis problems are to be expected. 

In this way the number of epitopes can be further reduced, e.g. to 50. 
These 50 epitopes can now be verified in an in vitro assay for their 
activity. Part or all of the peptides verified as being active can then be 
pooled and used as a vaccine. 

Example 2 

In vitro verification of the T-cell activation functionality of peptides 
identified or optimized, respectively, according to the invention. 



Peptide sequence   Source protein    Frequencies of reactive CD8+ T cells 
                                     Melanoma   Cutaneous T-cell lymphoma 

VTAQVVLQA          GP100             0.08       0.04 
VLAQVVLQL          GP100 optimized   0.18       0.12 
LVHFLLLKY          MAGE              0.99       0.03 
LLHFLLLKL          MAGE optimized    1.10       0.03 
FVWLHYYSV          TYR2              1.01       0.01 
FLWLHYYSL          TYR2 optimized    0.82       0.02 
Control                              0.10       0.02 

Claims 

1. A method for providing, identifying or/and optimizing peptides which 
induce cytotoxic T-lymphocytes, comprising the steps: 

(a) selecting one or more antigenic proteins, 

(b) selecting conserved regions within the protein sequence of 
the one or more antigenic proteins, and 

(c) identifying CD8+ T-cell epitopes within the protein sequence 
of the one or more antigenic proteins. 

2. The method according to claim 1, further comprising the step: 

(d) optimizing the identified CD8+ T-cell epitopes by exchanging 
one or more amino acids at the anchor positions thereof. 

3. The method according to claim 2, wherein step (d) is performed 
prior to step (c). 

4. The method according to any of the preceding claims, wherein step 
(c) is performed using an artificial neural network. 

5. The method according to any of the preceding claims, wherein in 
step (a) one or more antigenic proteins for cancer or/and a virus are 
selected. 

6. The method according to any of claims 1 to 5, wherein peptides 
having 4 to 30 amino acids are obtained. 

7. The method according to any of the preceding claims, wherein an 
additional filtering step is applied. 

8. The method according to any of claims 1 to 7, further comprising 
the step: 

verification of the activity of the identified or/and optimized 
peptides in vitro. 

9. Pharmaceutical composition comprising one or more peptides which 
induce cytotoxic T-lymphocytes obtainable according to the method 
of any of claims 1 to 8. 

10. The pharmaceutical composition according to claim 9, further 
comprising adjuvants, co-factors and/or co-stimulating agents. 

11. A method for providing a pharmaceutical composition for the 
induction of cytotoxic T-lymphocytes, comprising: 

(a) providing one or more peptides which induce cytotoxic T- 
lymphocytes according to the method of any of claims 1 to 8, 
and 

(b) using the one or more peptides for the manufacture of a 
pharmaceutical composition. 

12. Isolated peptide as depicted in any of Tables 1, 2/3 or 4, including 
the variants generated by AA exchange at positions 2, 6 and/or 9. 

13. Isolated peptide having the formula VTAQVVLQA, VLAQVVLQL, 
LVHFLLLKY, LLHFLLLKL, FVWLHYYSV or FLWLHYYSL, including 
the variants generated by AA exchange at positions 2, 6 and/or 9. 

14. Pharmaceutical composition comprising one or more peptides 
according to claim 12 or 13. 

15. Use of a peptide according to claims 12 or 13 or obtainable 
according to the method of any of claims 1 to 8 for the manufacture 
of a pharmaceutical for the induction of cytotoxic T-lymphocytes. 

16. Use according to claim 15 for the prevention, treatment or diagnosis 
of cancer or viral infections. 

Figure 1: 

Similarity search with selected tumor-associated antigen using BLASTP against 
SWISS-PROT 

BLASTP 2.1.3 

Reference: 
Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, 
Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), 
"Gapped BLAST and PSI-BLAST: a new generation of protein database search 
programs", Nucleic Acids Res. 25:3389-3402. 

Query= (385 letters) 

Database: Non-redundant SwissProt sequences 
96,469 sequences; 35,174,128 total letters 

                                                   Score     E 
Sequences producing significant alignments:       (bits)   Value 

CD34_HUMAN HEMATOPOIETIC PROGENITOR CE...           543    e-154 
CD34_CANFA HEMATOPOIETIC PROGENITOR CE...           359    8e-99 
CD34_MOUSE HEMATOPOIETIC PROGENITOR CE...           349    9e-96 

Alignments 

Query rows of the alignment (the rows for the database hits 3183511, 
2498215 and 3182946 are not legible in the source): 

1   17  [16 residues illegible]LDNNGTATPELPTQGTFSNVSTNVSYQETTTPSTLGSTSLHPVS  76 
1   77  QHGNEATTNITETTVKFTSTSVITSVYGNTNSSVQSQTSVISTVFTTPANVSTPETTLKP  136 
1  137  SLSPGNV--SDLSTTSTSLATSPTKPYTSSSPILSDIKAEIKCSGIREVKLTQG  188 
1  189  ICLEQNKTSSCAEFKKDRGEGLARVLCGEEQADADAGAQVCSLLLAQSEVRPQCLLLVLA  248 
1  249  NRTEISSKLQLMKKHQSDLKKLGILDFTEQDVASHQSYSQKTLIALVTSGALLAVLGITG  308 
1  309  YFLMNRRSWSPTGERLGEDPYYTENGGGQGYSSGPGTSPEAQGKASVNRGAQENGTGQAT  368 
1  369  SRNGHSARQHVVADTEL  385 



Database: Non-redundant SwissProt sequences 
Posted date: May 11, 2001 5:54 AM 
Number of letters in database: 35,174,128 
Number of sequences in database: 96,469 

Lambda     K        H 
0.312      0.128    0.357 

Gapped 
Lambda     K        H 
0.267      0.0410   0.140 

Matrix: BLOSUM62 
Gap Penalties: Existence: 11, Extension: 1 
Number of Hits to DB: 19900574 
Number of Sequences: 96469 
Number of extensions: 647916 
Number of successful extensions: 1005 
Number of sequences better than 10.0: 4 
Number of HSP's better than 10.0 without gapping: 3 
Number of HSP's successfully gapped in prelim test: 1 
Number of HSP's that attempted gapping in prelim test: 998 
Number of HSP's gapped (non-prelim): 4 
length of query: 385 
length of database: 35,174,128 
effective HSP length: 56 
effective length of query: 329 
effective length of database: 29,771,864 
effective search space: 9794943256 
effective search space used: 9794943256 
T: 11 
A: 40 
X1: 16 (7.2 bits) 
X2: 38 (14.6 bits) 
X3: 64 (24.7 bits) 
S1: 42 (21.9 bits) 
S2: 66 (30.0 bits) 

Figure 2: 

Calculation of 2 different conservation scores for all possible epitopes within positions 25-78 of the 
query molecule CD34_HUMAN, using BLASTP as shown in Figure 1. 

CD34_HUMAN HEMATOPOIETIC PROGENITOR CE...   543   e-154 
CD34_CANFA HEMATOPOIETIC PROGENITOR CE...   359   8e-99 
CD34_MOUSE HEMATOPOIETIC PROGENITOR CE...   349   9e-96 